In this guide you will acquire the skills needed to process and present spatial data in R. The objectives of the guide are as follows:

  1. Understand how spatial data are processed in R
  2. Learn basic data wrangling operations on spatial data
  3. Learn how to make a map in R

This guide focuses exclusively on polygon or areal data. You will have the opportunity to handle and examine point data in Lab 8. This lab guide follows closely and supplements the material presented in Chapters 2.1, 2.2, and 8 in the textbook Geocomputation with R (GWR) and class Handout 5.


Assignment 5 is due by 2:00 pm, May 4th on Canvas. See here for assignment guidelines. You must submit an .Rmd file and its associated .html file. Name the files: yourLastName_firstInitial_asgn05. For example: brazil_n_asgn05.

Open up an R Markdown file


Download the Lab template into an appropriate folder on your hard drive (preferably, a folder named ‘Lab 5’), open it in R Studio, and type and run your code there. The template is also located on Canvas under Files. Change the title (“Lab 5”) and insert your name and date. Don’t change anything else inside the YAML (the stuff at the top in between the ---). Also keep the grey chunk after the YAML. For a rundown on the use of R Markdown, see the assignment guidelines.

Installing and loading packages


You’ll need to install the following packages in R. You only need to do this once, so if you’ve already installed these packages, skip the code. Also, don’t put these install.packages() commands in your R Markdown document. Copy and paste the code in the R Console. We’ll talk about what functions these packages provide as they come up in the guide.

install.packages("sf")
install.packages("tigris")
install.packages("tmap")
install.packages("RColorBrewer")

You’ll need to load the following packages using library(). Unlike installing, you will always need to load packages whenever you start a new R session.

library(tidyverse)
library(tidycensus)
library(sf)
library(tigris)
library(tmap)
library(RColorBrewer)

Spatial data in R


The main package we will use for handling spatial data in R is the tidy friendly sf package. sf stands for simple features. What is a feature? A feature is thought of as a thing, or an object in the real world, such as a building or a tree. A county can be a feature. As can a city and a neighborhood. Features have a geometry describing where on Earth the features are located, and they have attributes, which describe other properties. Think back to Lab 3 - we were working with counties. The difference between what we were doing then and what we will be doing in this lab is that counties in Lab 3 had attributes (e.g. percent Hispanic, total population), but they did not have geometries. As such, we could not put them on a map because we didn’t have their specific geographic coordinates. This is what separates nonspatial and spatial data in R.
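To make the idea of "geometry plus attributes" concrete, here is a minimal sketch that builds a tiny sf object by hand. The two shapes, the place names, and the population values are all made up for illustration; they are not part of the lab data.

```r
library(sf)

# Geometry: each polygon is a closed ring of (x, y) coordinates,
# so the first and last points must be identical.
square   <- st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0))))
triangle <- st_polygon(list(rbind(c(2, 0), c(3, 0), c(2.5, 1), c(2, 0))))

# Attributes: an ordinary data frame with one row per feature (made-up values).
attrs <- data.frame(name = c("Square Town", "Triangle City"),
                    totp = c(1200, 3400))

# st_sfc() bundles the geometries into a geometry column
# (crs = 4326 means longitude/latitude on WGS84), and
# st_sf() attaches that column to the attributes.
toy <- st_sf(attrs, geometry = st_sfc(square, triangle, crs = 4326))

toy
class(toy)
```

Dropping the geometry column from toy would leave the kind of nonspatial table we worked with in Lab 3.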

Bringing in spatial data


sf is the specific type of data object that deals with spatial information in R. Think back to Lab 1 when we discussed the various ways R stores data - sf is just another way. But please note that spatial data themselves outside of R can take on many different formats. We’ll be primarily working with shapefiles in this class. Shapefiles are not the only type of spatial data, but they are the most commonly used. Let’s be clear here: sf objects are R specific and shapefiles are a general format of spatial data. This parallels nonspatial data, where tibbles are R specific and csv files are a general format.

We will be primarily working with census geographic data in this lab and pretty much all future labs. If you need a reminder of the Census geographies, go back to Handout 3. There are two major packages for bringing Census shapefiles into R: tidycensus and tigris.

tidycensus


In Lab 3, we worked with the tidycensus package and the Census API to bring Census data into R. Fortunately, we can use the same commands to bring in Census geographic data. First, load in your Census API key. If you already installed your API key in a past lab using install = TRUE in census_api_key(), skip this step.

census_api_key("YOUR API KEY GOES HERE", install = TRUE)

Then use the get_acs() command to bring in California tract-level median household income, total foreign-born population, and total population. Remember that “E” at the end of the variable indicates “Estimate” and “M” indicates margin of error.

ca.tracts <- get_acs(geography = "tract", 
              year = 2019,
              variables = c(medincome = "B19013_001", 
                            fb = "B05012_003", totp = "B05012_001"), 
              state = "CA",
              survey = "acs5",
              output = "wide",
              geometry = TRUE)

The only difference between the code above and what we used in Lab 3 is one additional argument in the get_acs() command: geometry = TRUE. This argument tells R to bring in the spatial features associated with the geography you specified in the command, in our case California tracts. You can further narrow your geographic scope to the county level by adding county = as an argument. For example, to get just Sacramento county tracts, you would add county = "Sacramento". Type in ca.tracts to see what we’ve got.

ca.tracts
## Simple feature collection with 8057 features and 8 fields (with 16 geometries empty)
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -124.4096 ymin: 32.53416 xmax: -114.1312 ymax: 42.00948
## Geodetic CRS:  NAD83
## First 10 features:
##          GEOID                                                 NAME medincomeE
## 1  06013370000   Census Tract 3700, Contra Costa County, California     102450
## 2  06001442301     Census Tract 4423.01, Alameda County, California     110761
## 3  06037405101 Census Tract 4051.01, Los Angeles County, California      78667
## 4  06037199800    Census Tract 1998, Los Angeles County, California      37755
## 5  06037291300    Census Tract 2913, Los Angeles County, California      81281
## 6  06037292000    Census Tract 2920, Los Angeles County, California      42135
## 7  06037604002 Census Tract 6040.02, Los Angeles County, California      61607
## 8  06037920035 Census Tract 9200.35, Los Angeles County, California      76992
## 9  06029002804          Census Tract 28.04, Kern County, California      55880
## 10 06107001800           Census Tract 18, Tulare County, California      49917
##    medincomeM  fbE fbM totpE totpM                       geometry
## 1       10160  645 189  2850   198 MULTIPOLYGON (((-122.327 37...
## 2       21966 2928 321  5496   299 MULTIPOLYGON (((-121.9701 3...
## 3       10132 2488 426  5617   590 MULTIPOLYGON (((-117.9693 3...
## 4        5885 2935 416  5828   492 MULTIPOLYGON (((-118.2156 3...
## 5       11317  685 142  3037   150 MULTIPOLYGON (((-118.3091 3...
## 6        9442 2681 428  6567   530 MULTIPOLYGON (((-118.3091 3...
## 7        7872 1843 344  4856   437 MULTIPOLYGON (((-118.3613 3...
## 8       10796 1997 386  6895   550 MULTIPOLYGON (((-118.4722 3...
## 9        9062  171  76  2510   201 MULTIPOLYGON (((-119.0745 3...
## 10       8683  383 179  4608   473 MULTIPOLYGON (((-119.314 36...


The object looks much like a basic tibble, but with a few differences.

  • The first line of the output indicates that the object is a simple feature collection with 8 fields (attributes or columns of data). There are 8057 features, in this case, census tracts.
  • Geometry type indicates that the spatial data are in MULTIPOLYGON form (as opposed to points or lines, the other basic vector data forms, which were discussed in Handout 5).
  • Bounding box indicates the spatial extent of the features (from left to right, for example, California tracts go from a longitude of -124.4096 to -114.1312).
  • Geodetic CRS identifies the coordinate reference system (here NAD83), which we’ll touch on later in the quarter.
  • The final difference is that the data frame contains the column geometry. This geometry is what makes this data frame spatial. Remember that a tibble is a data frame. Hence, an sf object is basically a tibble, or has tibble-like qualities. This means that we can use nearly all of the functions we’ve learned in the past three labs on sf objects. Hooray for consistency!
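Each piece of the printed header can also be pulled out with its own function. The sketch below does not use the lab data; it assumes the small North Carolina county shapefile (nc.shp) that ships with the sf package, so it runs without a Census API key or a download.

```r
library(sf)

# nc.shp is installed together with sf, so no download is needed.
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

class(nc)                  # includes both "sf" and "data.frame"
nrow(nc)                   # number of features (rows)
st_geometry_type(nc)[1:3]  # geometry type of the first three features
st_bbox(nc)                # bounding box: xmin, ymin, xmax, ymax
st_crs(nc)                 # coordinate reference system

# Dropping the geometry column leaves a plain, nonspatial data frame.
nc.nogeo <- st_drop_geometry(nc)
class(nc.nogeo)
```

The same functions should work on ca.tracts once you have brought it in with get_acs().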

tigris package


Another package that allows us to bring in census geographic boundaries is tigris. Here is a list of all the geographies you can download through this package. Let’s bring in the boundaries for Sacramento city. Remember from Handout 3 that cities are designated as places by the Census. Use the places() function to get all places in California.

pl <- places(state = "CA", cb = TRUE, year=2019)


The cb = TRUE argument tells R to download a generalized cartographic boundary file, which drastically reduces the size of the data (compare the file size when you don’t include cb = TRUE). For example, it eliminates all areas that are strictly covered by water (e.g. lakes). The argument year=2019 tells R to bring in the boundaries for that year (census geographies can change from year to year). When using the multi-year ACS, best to use the end year of the period. In the get_acs() command above we used year=2019, so also use year=2019 in the places() command. Note that unlike the tidycensus package, tigris does not allow you to attach attribute data (e.g. percent Hispanic, total population, etc.) to geometric features.

We can use filter() to keep Sacramento city.

sac.city <- filter(pl, NAME == "Sacramento")
sac.city

The argument NAME == "Sacramento" tells R to keep cities with the exact city name “Sacramento”.

Let’s use the function counties() to bring in county boundaries.

counties <- counties(cb = TRUE, year=2019, state = "CA")

To get Sacramento county, we use the filter() function.

sac.county <- filter(counties, NAME == "Sacramento")

Guess what? You earned another badge! Yipee!!

Reading from your hard drive


Directly reading spatial files using an API is great, but doesn’t exist for many spatial data sources. You’ll often have to download a spatial data set, save it onto your hard drive and read it into R. The function for reading spatial files from your hard drive as sf objects is st_read().

Let’s bring in two shapefiles I created that contain (1) median housing values for census tracts in Sacramento county and (2) Sacramento county parks. I zipped up the files and uploaded them onto GitHub. Make sure your current working directory is pointed to the appropriate folder on your hard drive (use setwd()). Use the following code to download and unzip the file.

download.file(url = "https://raw.githubusercontent.com/crd150/data/master/lab5files.zip", destfile = "lab5files.zip")
unzip(zipfile = "lab5files.zip")


Don’t worry if you don’t understand these commands - they are more for you to simply copy and paste so that you can download files that I zipped up and uploaded onto Github. You can look at the help documentation for each function if you are curious.

You should see SacramentoCountyTracts and Parks files in your current working directory (type in getwd() to find where these files reside on your hard drive). Note that the shapefile is actually not a single file but is represented by multiple files. For SacramentoCountyTracts, you should see four files named SacramentoCountyTracts with shp, dbf, prj, and shx extensions. These files are all connected to one another, so don’t manually alter these files. Moreover, if you want to remove a shapefile from your hard drive, delete all the associated files not just one. For Parks, you will see six associated files.

Bring in the Sacramento County tract shapefile using the function st_read(). You’ll need to add the .shp extension so that the function knows it’s reading in a shapefile.

sac.county.tracts <- st_read("SacramentoCountyTracts.shp", stringsAsFactors = FALSE)

The argument stringsAsFactors = FALSE tells R to keep any variables that look like a character as a character and not a factor, which we won’t use much, if at all, in this class.

Bring in the parks file.

parks <- st_read("Parks.shp", stringsAsFactors = FALSE)

Data Wrangling


There is a lot of stuff behind the curtain of how R handles spatial data as simple features, but the main takeaway is that sf objects are data frames. This means you can use many of the functions we’ve learned in the past couple labs to manipulate sf objects, and this includes our best buddy the pipe %>% operator. For example, let’s do the following data wrangling tasks on ca.tracts.

  1. Drop the margins of error
  2. Rename the variables
  3. Calculate percent foreign born

We do all of this in one line of continuous code using the pipe operator %>%

ca.tracts <- ca.tracts %>%
            select(-medincomeM, -fbM, -totpM) %>%
            rename(medincome = medincomeE, fb = fbE, totp = totpE) %>%
            mutate(pfb = fb/totp)

Notice that we’ve already used all of the functions above for nonspatial data wrangling. Another important operation is to join attribute data to an sf object. For example, let’s say you wanted to add tract level percent race/ethnicity, which is located in a csv file I’ve uploaded on GitHub.

ca.race <- read_csv("https://raw.githubusercontent.com/crd150/data/master/californiatractsrace.csv")

Remember, we’re dealing with data frames here, so we can use left_join(), which we covered in Lab 3, to join the nonspatial data frame ca.race to the spatial data frame sac.county.tracts.

sac.county.tracts <- sac.county.tracts %>%
  left_join(ca.race, by = "GEOID")
sac.county.tracts
## Simple feature collection with 317 features and 7 fields
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: -121.8625 ymin: 38.01842 xmax: -121.0271 ymax: 38.7364
## Geodetic CRS:  NAD83
## First 10 features:
##          GEOID                                              NAME medhval
## 1  06067001101 Census Tract 11.01, Sacramento County, California  344600
## 2  06067002700    Census Tract 27, Sacramento County, California  231100
## 3  06067004201 Census Tract 42.01, Sacramento County, California  207100
## 4  06067004402 Census Tract 44.02, Sacramento County, California  142100
## 5  06067004906 Census Tract 49.06, Sacramento County, California  144400
## 6  06067005510 Census Tract 55.10, Sacramento County, California  105300
## 7  06067005903 Census Tract 59.03, Sacramento County, California  336600
## 8  06067006800    Census Tract 68, Sacramento County, California  160800
## 9  06067007107 Census Tract 71.07, Sacramento County, California  439200
## 10 06067007428 Census Tract 74.28, Sacramento County, California  212100
##     pnhwhite     pnhasn     pnhblk     phisp                       geometry
## 1  0.5966425 0.06306715 0.14246824 0.1202359 POLYGON ((-121.4989 38.5780...
## 2  0.3511628 0.05203488 0.22965116 0.3011628 POLYGON ((-121.4758 38.5572...
## 3  0.1288228 0.15342559 0.21219887 0.4703571 POLYGON ((-121.5061 38.4956...
## 4  0.1166777 0.21819380 0.13315755 0.4689079 POLYGON ((-121.4648 38.5395...
## 5  0.1098211 0.24631315 0.25007844 0.2899278 POLYGON ((-121.4645 38.4814...
## 6  0.4553386 0.05754051 0.16368924 0.2702534 POLYGON ((-121.4107 38.5888...
## 7  0.6067044 0.03918791 0.15108593 0.1180359 POLYGON ((-121.3553 38.6174...
## 8  0.1407068 0.07041885 0.10431937 0.6077225 POLYGON ((-121.4676 38.6197...
## 9  0.3832402 0.21927374 0.11899441 0.1670391 POLYGON ((-121.6181 38.6733...
## 10 0.7104059 0.08090337 0.03859348 0.1040595 POLYGON ((-121.357 38.70376...

The main takeaway: sf objects are data frames, so you can use many of the functions you’ve learned in the past couple of labs on these objects.
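As a small self-contained check of this takeaway, the sketch below runs the same kind of wrangling and join on the North Carolina county shapefile that ships with sf, rather than on the lab data. The extra table and its region labels are hypothetical and exist only to give left_join() something to attach.

```r
library(sf)
library(dplyr)

nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# select(), rename() and mutate() work as usual; the geometry column stays attached.
nc.small <- nc %>%
  select(FIPS, NAME, BIR74, SID74) %>%
  rename(births = BIR74, sids = SID74) %>%
  mutate(sidrate = sids / births)

# A hypothetical nonspatial table keyed on the same county code (made-up labels).
extra <- tibble(FIPS = c("37009", "37005"),
                region = c("Mountains", "Mountains"))

# left_join() keeps every feature in nc.small and adds the matching columns
# from extra; counties without a match get NA in the new column.
nc.joined <- left_join(nc.small, extra, by = "FIPS")

class(nc.joined)
names(nc.joined)
```

Note that the spatial object goes first in left_join(), which is what keeps the result an sf object.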

Saving shapefiles


To save an sf object to a file, use the function st_write() and specify at least two arguments, the sf object you want to save and a file name in quotes with the file extension. You’ll also need to specify delete_layer = TRUE which overwrites the existing file if it already exists in your current working directory. Make sure you’ve set your directory to the folder you want your file to be saved in. Type in getwd() to see your current directory and use setwd() to set the directory.

Let’s save sac.county.tracts as a shapefile named saccountytractslab5.shp.

st_write(sac.county.tracts, "saccountytractslab5.shp", delete_layer = TRUE)

Check your current working directory to see if the file saccountytractslab5.shp was saved.

You can save your sf object in a number of different data formats other than shp. We won’t be concerned too much with these other formats in this class, but you can see a list of them here.
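A quick way to convince yourself that a save worked is to write the file and read it straight back in. The sketch below does this with the North Carolina shapefile that ships with sf and writes to a temporary folder, so it does not touch your working directory; the file name nc_copy.shp is just an illustrative choice.

```r
library(sf)

nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Write to a temporary folder so nothing in the working directory is changed.
out.file <- file.path(tempdir(), "nc_copy.shp")
st_write(nc, out.file, delete_layer = TRUE, quiet = TRUE)

# A shapefile is several files sharing one base name.
list.files(tempdir(), pattern = "^nc_copy")

# Read it back and confirm that the number of features is unchanged.
nc.copy <- st_read(out.file, quiet = TRUE)
nrow(nc.copy) == nrow(nc)
```

One thing to keep in mind with the shapefile format is that it shortens long column names, so check names() after reading a saved file back in.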

Mapping in R


Now that you’ve got your spatial data in and wrangled, the next natural step is to map something. There are several functions in R that can be used for mapping. We won’t go through all of them, but GWR outlines in Table 8.1 the range of mapping packages available in R. The package we’ll rely on in this class for mapping is tmap.

The foundation function for mapping in tmap is tm_shape(). You then build on tm_shape() by adding one or more elements, all taking on the form of tm_. Let’s make a choropleth map of median housing values.

tm_shape(sac.county.tracts) +
  tm_polygons(col = "medhval", style = "quantile")

You first put the dataset sac.county.tracts inside tm_shape(). Because you are plotting polygons, you use tm_polygons() next. If you are plotting points, you will use tm_dots(). If you are plotting lines, you will use tm_lines(). The argument col = "medhval" tells R to shade the tracts by the variable medhval. tmap allows users to specify the classification style with the style argument. The argument style = "quantile" tells R to break up the shading into quantiles, here five groups that each contain roughly the same number of tracts. Seven of the most useful classification styles are described in the bullet points below (taken from GWR):

  • style = pretty, the default setting, rounds breaks into whole numbers where possible and spaces them evenly
  • style = equal divides input values into bins of equal range, and is appropriate for variables with a uniform distribution (not recommended for variables with a skewed distribution as the resulting map may end-up having little color diversity)
  • style = quantile ensures the same number of observations fall into each category (with the potential down side that bin ranges can vary widely)
  • style = jenks identifies groups of similar values in the data and maximizes the differences between categories
  • style = cont (and order) present a large number of colors over continuous color field, and are particularly suited for continuous rasters (order can help visualize skewed distributions)
  • style = sd divides the values by standard deviations above and below the mean.
  • style = cat was designed to represent categorical values and assures that each category receives a unique color

The importance of choosing the appropriate classification scheme is discussed in Handout 5. You’ll get some practice trying out other classification schemes in this week’s assignment.
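To see why the choice of classification matters, it can help to compute the break points by hand. The sketch below uses only base R and a made-up, right-skewed variable; it mimics the idea behind the quantile and equal styles and is not necessarily the exact computation tmap performs internally.

```r
# A made-up, right-skewed variable: nine small values and one very large one.
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 100)

# Quantile breaks: cut points at the 0th, 20th, ..., 100th percentiles.
quantile.breaks <- quantile(x, probs = seq(0, 1, by = 0.2))

# Equal-interval breaks: five bins of identical width between the min and max.
equal.breaks <- seq(min(x), max(x), length.out = 6)

# Count how many observations land in each bin.
quantile.counts <- table(cut(x, breaks = quantile.breaks, include.lowest = TRUE))
equal.counts    <- table(cut(x, breaks = equal.breaks, include.lowest = TRUE))

quantile.counts
equal.counts
```

With quantile breaks each of the five bins holds two observations, while with equal breaks nine of the ten observations fall into the first bin, which on a map would leave nearly every polygon the same color.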

You can overlay multiple features on one map. For example, we can add park polygons on top of county tracts, providing a visual association between parks and median housing values. Here, we add another tm_shape() and tm_polygons() to the above code.

tm_shape(sac.county.tracts) +
  tm_polygons(col = "medhval", style = "quantile") +
  tm_shape(parks) +
    tm_polygons(col = "green")
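Layers do not have to share a geometry type. As a sketch, the code below stacks a point layer on top of polygons using the North Carolina shapefile that ships with sf; the county centroids are only a stand-in for a real point data set. It is written in the same tmap syntax as the rest of this guide, and I am assuming that newer tmap releases still accept it, possibly with deprecation messages.

```r
library(sf)
library(tmap)

nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Project to a planar CRS (EPSG:32119, NAD83 / North Carolina, in meters)
# so that centroids are computed on flat coordinates.
nc.proj <- st_transform(nc, 32119)

# One point per county, standing in for a point layer.
nc.points <- st_sf(geometry = st_centroid(st_geometry(nc.proj)))

nc.map <- tm_shape(nc.proj) +
  tm_polygons(col = "BIR74", style = "quantile") +  # bottom layer: shaded polygons
  tm_shape(nc.points) +
  tm_dots(col = "black", size = 0.1)                # top layer: points

nc.map
```

Layers are drawn in the order they are added, so whatever comes last in the chain sits on top.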

Color Scheme


Don’t like the yellow/brown color scheme? We can change the color scheme using the argument palette = within tm_polygons(). The argument palette = defines the color ranges associated with the bins as determined by the style argument. Below we use the color scheme “Reds” using style = "quantile".

tm_shape(sac.county.tracts) +
  tm_polygons(col = "medhval", style = "quantile", palette = "Reds")

See Ch. 8.2.4 in GWR for a fuller discussion on color and other schemes you can specify.

In addition to the built-in palettes, customized color ranges can be created by specifying a vector with the desired colors as anchors. This will create a spectrum of colors in the map that range between the colors specified in the vector. For instance, if we used c(“red”, “blue”), the color spectrum would move from red to purple, then to blue, with in between shades. In our example:

tm_shape(sac.county.tracts) +
  tm_polygons(col = "medhval", style = "quantile", palette = c("red", "blue"))

Not exactly a pretty picture. In order to capture a diverging scale, we insert “white” in between red and blue.

tm_shape(sac.county.tracts) +
  tm_polygons(col = "medhval", style = "quantile", palette = c("red", "white", "blue"))
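To get a feel for how anchor colors turn into a full spectrum, you can interpolate between them yourself with base R's colorRampPalette(). This is only a sketch of the general idea; I am not claiming it reproduces exactly how tmap builds its colors from the anchors.

```r
# colorRampPalette() returns a function; calling that function with n
# gives n colors interpolated between the anchors.
ramp <- colorRampPalette(c("red", "white", "blue"))
five.colors <- ramp(5)
five.colors

# Preview the colors as a strip of equal-height bars.
barplot(rep(1, 5), col = five.colors, border = NA, axes = FALSE)
```

The first, middle, and last colors are the three anchors themselves, and the two in between are blends of their neighbors.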

A preferred approach to selecting a color palette is to choose one of the schemes contained in the RColorBrewer package. These are based on the research of cartographer Cynthia Brewer (see the colorbrewer2 web site for details). ColorBrewer makes a distinction between sequential scales (for a scale that goes from low to high), diverging scales (to highlight how values differ from a central tendency), and qualitative scales (for categorical variables). For each scale, a series of single hue and multi-hue scales are suggested. In the RColorBrewer package, these are referred to by a name (e.g., the “Reds” palette we used above is an example). The full list is contained in the RColorBrewer documentation.

There are two very useful commands in this package. One sets a color palette by specifying its name and the number of desired categories. The result is a character vector with the hex codes of the corresponding colors.

For example, we select a sequential color scheme going from blue to green, as BuGn, by means of the command brewer.pal, with the number of categories (6) and the scheme as arguments. The resulting vector contains the HEX codes for the colors.

pal <- brewer.pal(6,"BuGn")
pal
## [1] "#EDF8FB" "#CCECE6" "#99D8C9" "#66C2A4" "#2CA25F" "#006D2C"

Using this color scheme in our map, referenced by its name “BuGn”, yields the following result.

tm_shape(sac.county.tracts) +
  tm_polygons(col = "medhval", style = "quantile", palette="BuGn") 

The command display.brewer.pal() allows us to explore different color schemes before applying them to a map. For example:

display.brewer.pal(6,"BuGn")

Legend


There are many options to change the formatting of the legend. Often, the automatic title for the legend is not intuitive, since it is simply the variable name (in our case, medhval). This can be customized by setting the title argument in tm_polygons(). Let’s change the legend title to “Housing values”

tm_shape(sac.county.tracts, unit = "mi") +
  tm_polygons(col = "medhval", style = "quantile",palette = "Reds", 
              title = "Housing values") 

Another important aspect of the legend is its positioning. This is handled through the tm_layout() function. This function has a vast number of options, as detailed in the documentation. Also check the help documentation for tm_layout() to see the complete list of settings and examples in Ch. 8.2.5 in GWR. There are also specialized subsets of layout functions, focused on specific aspects of the map, such as tm_legend(), tm_style() and tm_format(). We illustrate the positioning of the legend.

The default is to position the legend inside the map. Often, this default solution is appropriate, but sometimes further control is needed. The legend.position argument in the tm_layout() function moves the legend around the map, and it takes on a vector of two string variables that determine both the horizontal position (“left”, “right”, or “center”) and the vertical position (“top”, “bottom”, or “center”). The default is “right” and “bottom”. But, we can change it to, say, top right.

tm_shape(sac.county.tracts, unit = "mi") +
  tm_polygons(col = "medhval", style = "quantile",palette = "Reds",
              title = "Housing values") +
  tm_layout(legend.position =  c("right", "top"))

Yuck. We can leave it at the bottom right. Or there is also the option to position the legend outside the frame of the map. This is accomplished by setting legend.outside to TRUE (the default is FALSE) in tm_layout().

tm_shape(sac.county.tracts, unit = "mi") +
  tm_polygons(col = "medhval", style = "quantile",palette = "Reds", 
              title = "Housing values") +
    tm_layout(legend.outside = TRUE)

We can also customize the size of the legend, its alignment, font, etc. We refer to GWR for specifics.

Title


Another functionality of the tm_layout() function is to set a title for the map, and specify its position, size, etc. For example, we can set the title using main.title, and the size using main.title.size as in the example below. We made the font size a bit smaller (0.95) in order not to overwhelm the map.

tm_shape(sac.county.tracts, unit = "mi") +
  tm_polygons(col = "medhval", style = "quantile",palette = "Reds",
              title = "Housing values") +
    tm_layout(main.title = "2015-19 Median Housing Values in Sacramento County",
              main.title.size = 0.95, legend.outside = TRUE)

You can change the title position using main.title.position. For example, we center the title

tm_shape(sac.county.tracts, unit = "mi") +
  tm_polygons(col = "medhval", style = "quantile",palette = "Reds",
              title = "Housing values") +
    tm_layout(main.title = "2015-19 Median Housing Values in Sacramento County",
              main.title.size = 0.95, main.title.position="center", 
              legend.outside = TRUE)

Scale bar and arrow


We need to add the other key map elements described in Handout 5. Here is where we start adding layout functions after tm_polygons() using the + operator. First, the scale bar, which you can add using the function tm_scale_bar()

tm_shape(sac.county.tracts, unit = "mi") +
tm_polygons(col = "medhval", style = "quantile",palette = "Reds",
              title = "Housing values") +
    tm_layout(main.title = "2015-19 Median Housing Values in Sacramento County",
              main.title.size = 0.95, main.title.position="center", 
              legend.outside = TRUE) +
  tm_scale_bar(breaks = c(0, 5, 10), text.size  = 0.75, 
               position = c("right", "bottom")) 

The argument breaks within tm_scale_bar() tells R the distances at which to break up and end the bar. Make sure you use reasonable break points - the Sacramento county area is not, for example, 200 miles wide, so you should not use c(0,100,200) (try it and see what happens; you won’t like it). Note that the scale is in miles (we’re in America!). The default is in kilometers (the rest of the world!), but you can specify the units within tm_shape() using the argument unit. Here, we used unit = "mi" to designate distance in the scale bar measured in miles. The position = argument locates the scale bar on the bottom right of the map. The argument text.size = controls the relative size of the scale bar text; here we set it to 0.75.


The next element is the north arrow, which we can add using the function tm_compass(). You can control the type, size and location of the arrow within this function. We place a 4-star arrow on the top left of the map.

tm_shape(sac.county.tracts, unit = "mi") +
tm_polygons(col = "medhval", style = "quantile",palette = "Reds",
              title = "Housing values") +
    tm_layout(main.title = "2015-19 Median Housing Values in Sacramento County",
              main.title.size = 0.95, main.title.position="center", 
              legend.outside = TRUE) +
  tm_scale_bar(breaks = c(0, 5, 10), text.size  = 0.75, 
               position = c("right", "bottom"))  +
  tm_compass(type = "4star", position = c("left", "top")) 

Other features


We can make the map prettier by changing a variety of settings. We can eliminate the frame around the map using the argument frame = FALSE within tm_layout(). We also add back the parks.

sac.map <- tm_shape(sac.county.tracts, unit = "mi") +
  tm_polygons(col = "medhval", style = "quantile", palette = "Reds",
              title = "Housing values") +
  tm_layout(main.title = "2015-19 Median Housing Values in Sacramento County",
            main.title.size = 0.95, main.title.position = "center",
            legend.outside = TRUE, frame = FALSE) +
  tm_scale_bar(breaks = c(0, 5, 10), text.size = 0.75,
               position = c("right", "bottom")) +
  tm_compass(type = "4star", position = c("left", "top")) +
  tm_shape(parks) +
  tm_polygons(col = "green")

sac.map

Notice that we stored the map in an object called sac.map. R is an object-oriented language, so everything you make in R is an object that can be stored for future manipulation. This includes maps. You should see sac.map in your Environment window. By storing the map, you can access it anytime during your current R session.

Check the full list of tm_ elements here.

Saving maps


You can save your maps a couple of ways.

  1. On the plotting screen where the map is shown, click on Export and save it as either an image or pdf file.
  2. Use the function tmap_save()

For option 2, we can save the map object sac.map as such

tmap_save(sac.map, "saccountyhval.png")

Specify the tmap object and a filename with an extension. It supports pdf, eps, svg, wmf, png, jpg, bmp and tiff. The default is png. Also make sure you’ve set your directory to the folder that you want your map to be saved in.

Interactive maps


So far we’ve created static maps. That is, maps that don’t “move”. But, we’re all likely used to Google or Bing maps - maps that we can move around and zoom into. You can make interactive maps in R using the package tmap.

To make your tmap object interactive, use the function tmap_mode(). Type in “view” inside this function.

tmap_mode("view")

Now that the interactive mode has been ‘turned on’, all maps produced with tm_shape() will be displayed as interactive maps. Let’s view our saved sac.map interactively.

sac.map

Click on the icon above the map and a larger window should open up.

Besides interactivity, another important benefit of tmap_mode() is that it provides a basemap, which was discussed in Handout 5. The function of a basemap is to provide background detail necessary to orient the location of the map. In the static maps we produced earlier, Sacramento county was sort of floating in white space. As you can see in the interactive map above we’ve added geographic context to the surrounding area.

The default basemap in tmap_mode() is CartoDB.Positron. You can change the basemap through the tm_basemap() function. For example, let’s change the basemap to an OpenStreetMap.

sac.map + tm_basemap("OpenStreetMap")

For a complete list of basemaps with previews, see here. There are a lot of cool ones, so please test them out.

You can save your interactive map using the same methods described for static maps. To switch back to plotting mode (noninteractive), type in

tmap_mode("plot")

You’ve completed your introduction to sf. Whew! Badge? Yes, please, you earned it! Time to celebrate!

Assignment 5


Download and open the Assignment 5 R Markdown Script. The script can also be found on Canvas (Files - Week 5 - Assignment). Any response requiring a data analysis task must be supported by code you generate to produce your result. Just examining your various objects in the “Environment” section of R Studio is insufficient—you must use scripted commands. Submit the Rmd and its knitted html files on Canvas.


  1. One can easily lie (or distort the story) with maps. One of the commonly used tricks for misrepresenting the spatial distribution of phenomena relates to the inappropriate classifications of numeric variables. In the lab guide, we showed quantile breaks of median income in Sacramento. Bring into R the object saccountytractslab5.shp you saved in Lab and produce static maps using three other classifications. Make sure to include all the map features described in the lab (legend, title, etc.). Does your general impression of where high and low income neighborhoods are located in the city change across the different classifications? (6 points)

  2. We are in the midst of a global health crisis marked by profound human suffering. Analyzing the demographic, health and job characteristics of communities reveals that some places are disproportionately affected by this suffering. Communities with no or unstable internet access are at a disadvantage given that employment, medical care (telemedicine), education and other important services are primarily conducted online at home. Let’s visually examine the association between internet access and race/ethnicity in Sacramento county census tracts.

    a. Using the Census API, bring in 2015-2019 total, non-Hispanic white, non-Hispanic black, non-Hispanic Asian, and Hispanic populations for census tracts in Sacramento county. Check Lab 3 for the appropriate variable IDs. Make sure to bring in spatial data by using geometry = TRUE. Calculate the percent Asian, white, black and Hispanic variables. (2 points)
    b. Bring into R the 2015-2019 percent of households without a broadband, cable, Fiber Optic, or DSL internet connection for Sacramento county census tracts from PolicyMap (from the PolicyMap site, click on Health, Computer and Internet Access under the COVID-19 banner, Without High Speed Internet Access, and No Cable, Fiber Optic or DSL). Make sure to clean the data. (2 points)
    c. Merge the percent of households without a broadband, cable, Fiber Optic, or DSL internet connection to the sf census tracts object you created in (a). (1 point)
    d. Create maps of percent non-Hispanic black, non-Hispanic white, non-Hispanic Asian, and Hispanic using quantile breaks. (6 points)
    e. Create a map of the percent of households without a broadband, cable, Fiber Optic, or DSL internet connection using quantile breaks. (3 points)

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Website created and maintained by Noli Brazil